ABSTRACT

The study reviewed the formats and psychometric 
rationale of several alleged culture-fair tests. 
Advantages and disadvantages of each instrument were 
examined and implications for compensatory education 
were discussed. 
THE IMPOSSIBLE DREAM: A CULTURE-FREE TEST



Although tests have been in existence informally for 
millennia, the quest for a culture-free test has only been
in the offing since 1926. When it was apparent that sub- 
jects from the lower socioeconomic strata consistently 
scored poorly on the conventional type of intelligence test, 
e.g., Stanford-Binet (SB), an attempt was made to get at
this nebulous "intelligence" sans the frills of culture as 
if culture were the culprit causing the attainment of low 
scores on intelligence tests. 

Actually the first formal indication that intelligence
testing was about to begin occurred in 1869 when
Galton conducted his study of genius. He found that men 
of noted intellect in Britain emanated from just a few
families within the country. He attributed the trend of 
the data to heredity, for he neglected to distinguish one 
of the intervening variables in intelligence, an enriched 
environment, which these families offered their members 
(Barclay, 1968). 

The genesis of the culture-free test occurred in 
1926 when Davey discovered that pictorial "tests of intel- 
ligence" involved G which Spearman held to be the very 
essence of intelligence, analytical ability. Spearman 
himself constructed a Visual Perception Test which even 
eliminated verbal instructions by using pantomime. He
found that such perceptual tests were highly saturated with 
the G factor. In 1938 Raven's Progressive Matrices appeared, 
and two years later Cattell's Culture Fair Tests evolved 
(Cattell, 1940). Other tests which are generally billed as 
culture free are the Porteus Maze, Goodenough-Harris Drawing 
Test, Leiter International Performance Scale, Davis-Eells
Games, and the non-verbal sections of such tests as the 
Lorge-Thorndike.

A Look at the Alleged Culture Free Tests 
Raven's Progressive Matrices (PM) measures the abil- 
ity to perceive relationships. The test can be given to 
children from five and a half years old to adulthood, and 
the test can be given either individually or in groups. 

The task is simply to select the design that completes the 
pattern, so directions can even be given in pantomime. The 
test is so flexible that one may speed the test or not 
depending upon the situation. The test can be given on a 
formboard for very young children, and the task is thereby
simplified, for the child need only choose a block and put it in
the blank. Many tests use this format, but not the same
items since there is the chance that subjects may have 
taken the test before and learned the items. Because of 
the many arbitrary administration procedures and general 
lack of standardization, one cannot compare results from 
various administrations. 
The virtues of the PM are that it uses concepts
familiar to most, and it is less dependent on education 
than most intelligence tests. Its greatest fault is that
it samples a narrow group of abilities, so one must use
some type of vocabulary test to supply the information it lacks
concerning verbal ability. Although it may measure
pure G, it cannot predict as well as a composite of pure G 
plus verbal, spatial, and other type tests which may give 
an indication of the skills needed for the job or course 
for which one wants to predict. The matrix test does show 
the ability of the subject to direct attention to informa- 
tion, process it, and regulate thought even in those who 
have not developed reading and verbal skills fully. The
empirical evidence shows, however, that those who have not had a
great deal of education generally have lower scores
than those with more education. Barrett's form of
1956 was found to correlate .75 with the full-scale WISC 
(Cronbach, 1970). In a study of 30 bilingual Hispanamerican
students (8-13 years old), the Raven Colored Progressive
Matrices appeared to be the best predictor of school
success among other non-verbal and verbal intelligence 
tests. It was hypothesized that perceptual-motor skills 
were being used by these children in lieu of verbal skills 
(Philippus, 1967). 

The Lorge-Thorndike (LT) has five verbal and three
non-verbal subtests, each of which has a time limit; the limits
total 62 minutes plus time for instruction. Although it
has time limits, it has been found that when these were not
imposed, scores did not increase. The test
is constructed in five levels so that a subject is not given 
impossibly difficult items nor too easy ones. There are two 
separate booklets: one for kindergarten through second grade
and the other for grades three through twelve. The non-verbal
subtests consist of classifications, analogies, and number 
series which together bear close resemblance to Spearman's G. 
The correlation between the non-verbal section and the PM is 
.63, and between it and the WISC Performance the correlation 
coefficient is .70. The SAT mathematical score and the non- 
verbal section correlate .70, and furthermore Cronbach (1970) 
sees this part of the LT as being equal to Cattell's Culture 
Fair Test in every respect. However, it is interesting to 
note that the manual advises one to administer lower levels 
to lower socioeconomic groups than to higher socioeconomic 
groups. This seems to imply some cultural factor operating 
here.

Cattell's Culture Fair Test (IPAT CF) has three scales,
the first of which is designed for six- to eight-year-olds,
the second for ten- to fifteen-year-olds, and the third for seventeen-
to eighteen-year-olds. The test's various subtests
supposedly tap both crystallized and fluid general abilities 
of which Cattell and Butcher (1968) say that crystallized 
ability depends on culture, but fluid ability shows adapta- 
bility to new situations where crystallized skills are not 
applicable. Spearman's G seems to approximate Cattell's 
fluid ability.

Piaget's distinction between instruction and construc- 
tion seems to parallel the distinction between crystallized 
and fluid ability. Jensen likewise partitions learning into 
two constructs. His Level 1 and Level 2 types of learning abilities
approximate Piaget's instruction-construction scheme.
Level 1 involves the retention of input and the productive 
capacity of repetition. Level 1 has been called associative 
learning and Level 2 has been called conceptual or problem 
solving ability. Level 2 learning involves the manipulation 
and transformation of material which sounds very much like 
the assimilation-accommodation process which Piaget has 
termed construction.

Scale I of the IPAT consists of substitution, classification,
mazes, selecting named objects, following directions,
wrong pictures, riddles, and similarities. In various
combinations, these subtests can be constructed into tailor- 
made tests for individual or group administration, or to get 
scores specifically for fluid or crystallized ability. There 
are two equivalent forms, each of which takes 35 minutes to
administer completely. Scale II takes 25 minutes for admin- 
istration and 25 minutes for instructions, and there are also 
two equivalent forms for this scale. There are four subtests 
— series, classifications, matrices, and topology — which 
supposedly tap fluid general ability. A special test was 
devised to be used to supplement Scale II in an effort to tap 
crystallized general ability. Scale III has the same format 
as Scale II, but qualitatively more difficult items, and can
be used in conjunction with the Cattell-Scale III as a pre- 
dictor of graduate school success. Cattell attempted to use 
content in his tests which is universal in nature and "over- 
learned," e.g., parts of the human body. He opposes strictly 
performance tests on the grounds that they avoid knowledge 
and verbal skills, thereby losing knowledge itself (Cattell,
1940).

Cattell found that the asymptotic level of the IPAT CF 
coincided with that of biological maturation, whereas the 
asymptotic level of the traditional intelligence tests seems
to parallel the age when formal schooling terminates. He 
further maintains that the norms remain more constant than 
those of the conventional tests which he advocates should be 
restandardized every two or three years in an effort to keep 
up with cultural changes. The traditional tests have been 
found to be more accurate than the IPAT CF over the short 
term, but the IPAT CF is more accurate over the long run. 
Although the fluid ability is considered of fundamental importance,
it is remarked that both fluid and crystallized
abilities are necessary for efficient intellectual performance,
and "In real life, it will also depend to some extent
on personality and motivation factors..." (Cattell & Butcher,
1968, p. 21).

Dickinson (1968) used the IPAT CF in an experiment 
with first graders in which he had an experimental group 
trained in the skills of classifying things on five different 
levels of sophistication over a period of eight months. The
control group went up in achievement as measured by the 
Stanford Achievement Test, but the experimental group scored 
significantly higher on the IPAT CF and on the California 
Short-Form Test of Mental Maturity (full scale). This type
of data seems to be a serious challenge to the idea of fluid
general ability being more fundamental than crystallized
since one can teach fluid ability as well; it just seems to
be a different area of knowledge that has to be taught,
namely process rather than content.

The Goodenough-Harris Drawing Test asks that the subject
draw the best man and woman that he is able to. The
test covers the age ranges of preschool to fifteen years of 
age, and Cronbach (1970) suggests that it be used to supple- 
ment the WISC and SB, but not as a substitute. By virtue of 
the format, the test is simple to administer and scoring 
rules have been carefully prepared. This test has been found 
to be bound with some cultural influences for very divergent 
cultures, e.g., Moslem Arabs have a taboo regarding drawing 
of images, and the Hopi men produce the ceremonial art and 
thereby have IQs one standard deviation above those of the 
women (Hunt, 1969). Some subcultures do not allow for as
much opportunity to draw as others, and in conjunction with
this fact separate Negro norms are provided for more discriminating
evaluations.

In an experiment using first graders from Anglo-Saxon 
and Spanish backgrounds, the Drawing Test, the LT Form A, 
and the California Achievement Test (CAT) Form W were administered
in an effort to locate a test which minimized cultural
bias. It was found that the Drawing Test and the LT were
nearly identical in predicting the CAT, and that furthermore 
the Drawing Test tended to bring the two groups closer to- 
gether in the IQ distribution (Schroeder & Bemis, 1969). 
Anastasi and deJesus (1953) found that mean IQs of Puerto 
Rican nursery school children in New York City were not sig- 
nificantly different from mean scores previously established 
for white and Negro children from the same neighborhood as 
measured by the Drawing Test. The empirical evidence speaks 
well for the Drawing Test, but one must recall Cronbach's 
caveat to use it as a supplement and not as a substitute for 
the conventional intelligence tests. 

The Davis-Eells Games consist of pictures with oral 
instructions given by the examiner. The contents consist of 
everyday experiences of children, which include probability, 
money, "best ways," and analogies, and the vocabulary is 
supposedly common to all urban American children. The items 
were selected on the basis of reasonableness of problems as 
indicators of general problem-solving ability. The test is
not speeded, for it was felt that speed is culturally contaminated
since it tests quick recall vis-à-vis the ability
to solve problems. The examiner is instructed to show warmth 
and encouragement to the subjects. There are two forms: the
Primary Test for grades one and two and the Elementary Test
for grades three through six (Ahmann, Glock, & Wardeberg,
1960). Coleman and Ward (1955) compared the Games and the
Kuhlmann-Finch scores of children from low and high socioeconomic
groups, and they found no significant differences
between the two. Ludlow (1966) found that the Games were not 
superior to conventional test means, and furthermore that 
lower-class retarded children had significantly higher scores 
on the WISC Performance than on the Games. These children 
showed no significant differences on their scores between the 
Games and the SB, WISC, or the California Test of Mental 
Maturity (CMM). It seems clear from these data that the
Davis-Eells Games is not immune from culture as its authors 
would have us believe. 



Arguments: Pro and Con

Now that we have looked at the format of some of the
"culture-free" tests and some of the empirical evidence, it
is appropriate to investigate the pros and cons of such 
instruments. The first argument involves an indictment 
against all intelligence tests in general. Great fluctuations
are found in IQ scores for children six to eighteen
years of age. Data show the IQ is not a reliable concept
since the test which measures it, the grade level at which
it is measured, and the norms by which it is evaluated can
make a difference in "IQ". It has been found "that changes
in mental test scores tend to be in the direction of the
family level, as judged by the parents' education and socioeconomic
status" (Honzik, Macfarlane, & Allen, 1966, p. 172).
It has been estimated that a score on a six-year test could 
change 20 points for one out of three children by the age of
18, and 15 points for six out of ten children, thus seeming
to undermine the predictive validity of an intelligence test
score which one is prone to take at face value (Honzik et al.,
1966). One basic error that constantly occurs in
evaluation is the assumption that testing provides an objective basis for
assessment and evaluation of an individual's characteristics,
whereas in reality such testing is most valid for groups and
least valid for individuals. Tests can only give qualified
information which is dependent on many possible sources of 
error for individuals. In all cases test scores should be 
analyzed in terms of the cultural grouping's overall performance,
and the criterion of excellence as purported by the
"average" should not be treated as the ideal. "The mathematical symbol
becomes all too often the criterion of expected performance 
without any real reference to the behavioral phenomena needed 
for success in the cultural setting" (Barclay, 1968, p. 26). 

Vernon (1965) states that the cultural level of the 
home is the single most significant influence on the scores 
(even above SES) of intelligence tests. If this is true, 
and it is also true that a four-year-old's IQ correlates 
about .70 with IQ in late adolescence (Cronbach, 1970), then
equal education can be of very little help since the lower- 
class children are not adequately prepared for school, and 
the upper-class children are more than adequately prepared 
to assume the role of student. Strodtbeck (1965) has iso- 
lated some of the factors which create this situation. He 
advocates the position that lower-class children are doomed
to failure, because they do not develop test-taking skills,
responsiveness to speed requirements, and familiarity with 
vocabulary. He feels that these are the very reasons why the 
lower-class child shows poor performance in the classroom, on 
the job, as well as on tests. The question is posed whether the
importance of verbal intelligence should be diminished in the
school environment, or whether the curriculum should be altered to
develop verbal intelligence throughout the entire population. 
This inquiry leaves the concept of culture-free tests in a 
very tenuous position. 

Eells et al. (1951) define other characteristics of
the middle-class home which the lower-class home lacks. The 
child in the middle-class home learns to make a good impres- 
sion, and to develop certain attitudes toward himself, and 
toward task performance. These influence his response to 
tests and to school assignments which he learns to take 
seriously since he is constantly given tangible and intang- 
ible reinforcement, whereas the lower-class child learns to 
take his assignments lightly and just work to keep out of 
trouble. His rewards are the intangible ones which he gets
from his peers and not from approving adults. 

Hertzig et al. (1968) found similar evidence in a
study of Puerto Rican preschoolers. Children in Puerto Rican
homes find their reinforcement contingent on effective social 
interactions rather than on task mastery, which is the basis 
of our task-oriented schools. Since Puerto Rican and middle- 
class SES children differ at three years of age from each 
other in their styles of response to demands for cognitive
functioning, the continuation of this difference in the school
situation is practically assured, and so is their failure in
United States schools. Although differences in behavioral
style may have differential consequences, the presence of a 
difference should not be construed to mean that one pattern 
is superior or inferior to another. How effectively would a 
middle-class white student function in a South American school? 
Just as Hertzig et al. found a difference in behavioral styles
of different ethnic groups, so did Lesser, Fifer, and Clark 
(1965) find that there are differences in the hierarchical 
patterns of intellectual skills between ethnic groups but not 
within them. These data raise the issue of
how the authors of culture-free tests can presuppose common
experiences when there may in fact be none at all.

There are some psychophysical data that suggest that
some conceptual difficulties may originate in perceptual insensitivities
(Farnham-Diggory, 1970). Perceptual difficulties
have been found in disadvantaged children which make it
a problem for them to find similarities, differences, and
relations of the part to the whole (Klaus & Gray, 1968).

This type of evidence would make one wonder exactly how much
such a child can discern in a test which relies purely on
perceptual abilities and is, in fact, known as a perceptual
intelligence test. If definitive evidence were forthcoming
regarding perceptual inadequacies, there would be two
major types of research: why the difficulties occur (an organic basis
or a lack of experiential opportunities) and how to overcome
them through alternative routes which
could compensate for them.

Lorge (1945) supports the notion that education does 
make an inherent difference on intelligence tests. He found 
that those with greater education scored significantly higher 
than those with less education. The evidence comes from retests
of equated groups of subjects after twenty years.
Education increases the mastery of abilities measured by IQ 
tests while lack of further education diminishes the mastery 
of such abilities. His argument is that superior intellec- 
tual ability needs stimulation, and that full potentials may 
be lost in the absence of such stimulation. In the same
vein, Brazziel (1969, p. 207) says, "If conceptual learning 
is viewed as a gradual acculturation process and offered 
early in school careers, these children (inner city blacks) 
can be made to think." 

As if these complications were not enough for the
test writers to cope with, the mere presence of a strange examiner
can often be found to be instrumental in depressing
scores. It may be advisable in some cases to spend some
time with the subject until he is at ease with the examiner, 
or an examiner with the same background may be indicated. 
Abstract-type tasks may seem of little value for some to 
bother with since their culture may involve itself only with 
things of practical significance. If a subject does not have 
a familiarity with pictures and diagrams, or even paper and 
pencil, scores which are reflective of the subject's ability
may not be elicited (Anastasi, 1960). 

After this brief glance at some of the intervening 
variables which cause low scores on intelligence tests, one 
can understand why the search for a culture-free instrument 
would be desirable. However, one must realize that the pos- 
sibility exists that an intelligence test cannot be construc- 
ted which does not discriminate among classes since social 
class differences are real. Stroud (1957, p. STD-9A) issues 
an injunction against the concept of culture-free tests when 
he says that "...the cultural impact associated with social 
class differences may affect the course of mental development 
of children as well as their performance on intelligence 
tests." Bradfield and Moredock (1957, p. 375) give added 
credence to Stroud's thoughts when they say, "Speaking and 
reading, writing and listening, of all human behaviors, 
perhaps necessitates the most discrimination, memory, gen- 
eralization, etc., and hence you should expect the children 
more skilled verbally to be more intelligent." This may not 
be as far fetched as it first sounds, for if it is true that 
language is a tool for the development of the intellect as 
Piaget (1967) would have us believe, then those who do not 
have formal linguistic skills would not be as well equipped 
in the area of intellectual ability. Unfortunately formal 
linguistic skills taught at school in a foreign language 
(even blacks use another language at home) are not always 
flexible in nature, and therefore, they do not become part of 
the intentional repertoire of these marginal children.

Anastasi (1958a, p. 534) questions the feasibility of 
constructing a culture-free instrument on the grounds that 
perhaps the concept of intelligence is itself culturally 
conditioned and restricted. "It is not so much that tests 
are unfair to lower-status groups, as that lower-class envi- 
ronment is not conducive to the effective development of 
'intelligence' as defined in our culture." Lesser et al.
(1965, p. 12) quote Lorge:

There is no virtue in developing instruments so blunted 
that they decrease the amount of information. Perhaps 
the best method of reducing bias in tests of intelli- 
gence is to use them with the full knowledge that endow- 
ment interacting with opportunity produces a wide range 
of differences. Appraisal of the variation of different 
kinds of intellectual functioning requires many kinds of 
tests so that the differences can be utilized for the 
benefit of the individual and for the good of society. 
Intellectual functioning certainly does involve the 
ability to learn to adjust to the environment or to 
adapt the environment to individual needs and capabil- 
ities by the process of solving problems either directly 
or incidentally. Such a concept recognizes a variety of 
different kinds of problems. The full appreciation of 
the variety of aptitudes and the development of adequate 
methods for appraising them, should in the long run, 
ultimately lead to the production of enough information 
to eliminate bias. 

To put it more succinctly, "...to rule out cultural
differentials from test items so as to make them equally 'fair'
to subjects in different social classes or in different cul- 
tures may merely limit the usefulness of the test, since the 
same cultural differentials may operate within the broader 
area of behavior which the test is designed to sample." 
(Anastasi, 1958b, p. 202). 
Culture-free intelligence tests may go on being a controversial
issue in spite of all the logical arguments which
oppose such instruments, but the fact that culture-sterile 
materials are only a stopgap measure should become apparent 
to all. The immediate suggestion for the long-range outlook 
would appear to be compensatory education, but one might ask 
in what language and whose curriculum. It seems valid to 
suggest that children from backgrounds with different lan- 
guages at home be taught bilingually until they are able to 
switch to either with equal facility. Culture-infested tests
of intelligence would then give a true picture of the progress 
of process and not of the stagnation which it now records. 

The most prolific authority on the ontogenesis of 
intelligence, Piaget (1967), posits that humans are endowed 
with two functional invariants, i.e., adaptation and organization,
two modi operandi, but that experience and environment
intervene to activate the mechanism, and if the unit is
not frequently and well nourished, the mechanism will never 
develop its full potential. Anastasi (1960) notes that changes 
in intelligence test scores are influenced by intervening ex- 
periences which run the gamut from emotional to environmental 
experiences. In general, the underprivileged environment 
rears children who lose IQ points with age, and superior 
environments rear children who gain IQ points. Psychologists 
have found significant increases in mean scores which coincide 
with socioeconomic and educational improvements. 

Pepin (1971) quotes Bruner as saying that the objective
of teaching should be to discover the limits of capacities, 
i.e., how far can the child go given the best instruction, and 
that he advocates teaching and testing alternately to show 
where the child is capable of going. The reality of social 
conditions must be corrected and not a score. The idea is to 
find out the educational diagnosis and then find a cure for 
the deficiencies rather than predicting the death of the intellect
of the patient. Poor performance on IQ tests should
be an incentive and a challenge to educators, who should endeavor
to modify scores in an upward direction. There is no damage
done to the validity of a test if one coaches a student in 
making better test scores if such coaching is instrumental in 
improving a general area of intellectual skills as well as 
would be accomplished through a carefully constructed curri- 
culum. Just as content can be taught, so can process. 

Piaget (1970, p. 714) says that "...learning is no more 
than a sequence of cognitive development which is facilitated 
or accelerated by experience." He contends that teaching
through environmental activities and experiences can accelerate 
or complete structures which are being developed but that the 
order of succession will remain invariant. He is of the
opinion that the optimal situation for learning is one in 
which the child discovers the new information himself (could 
this be the needed curriculum?), but he does urge teachers
to devise situations in which there would be the opportunity
for such unfolding to occur. He contends that exercise,
experience, and social action are the variants between heredity
and actualization. The educational implications are very
grave ones, and it is not in the interest of children from any
social class to let them go unheeded by alleviating the
problem with a correction factor on a test.



Attempts at Operational Definition 



The fact of mean group differences does not per se 
indicate that a test is unfair or biased for the group with 
the lower mean. Cronbach (1970) suggests that regression 
analysis (not t-testing) procedures are appropriate for exam- 
ining and determining test bias. Test bias has been defined 
by Cleary (1968, p. 115): 

A test is biased for members of a subgroup of the 
population if, in the prediction of a criterion for 
which the test is designed, consistent nonzero errors 
of prediction are made for members of the subgroups. 

In other words, the test is biased if the criterion 
score predicted from the common regression line is 
consistently too high or too low for members of the 
subgroups.

As Linn and Werts (1971) point out, the critical component of 
the Cleary definition is that the criterion variable be free 
of bias. Rarely do we find in educational and psychological 
data criteria completely bias-free. The problem is further 
compounded by the unreliability of the criterion variable.
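
The Cleary definition quoted above can be illustrated with a minimal sketch. It assumes a single test score, a single criterion, and an ordinary least-squares line fitted to the pooled sample; the function names, the group labels, and the data are hypothetical and invented for illustration, not taken from Cleary (1968). A subgroup whose mean error of prediction (actual minus predicted criterion) is consistently nonzero would, under this definition, indicate bias.

```python
from statistics import mean


def fit_common_line(scores, criterion):
    """Ordinary least-squares intercept and slope for the pooled sample."""
    mx, my = mean(scores), mean(criterion)
    sxx = sum((x - mx) ** 2 for x in scores)
    sxy = sum((x - mx) * (y - my) for x, y in zip(scores, criterion))
    slope = sxy / sxx
    return my - slope * mx, slope


def mean_prediction_errors(scores, criterion, groups):
    """Mean (actual - predicted) criterion per subgroup, using the common line."""
    intercept, slope = fit_common_line(scores, criterion)
    by_group = {}
    for x, y, g in zip(scores, criterion, groups):
        by_group.setdefault(g, []).append(y - (intercept + slope * x))
    return {g: mean(errs) for g, errs in by_group.items()}


# Hypothetical test scores and criterion values (e.g., grade averages).
scores = [90, 100, 110, 120, 90, 100, 110, 120]
criterion = [2.0, 2.5, 3.0, 3.5, 2.4, 2.9, 3.4, 3.9]
groups = ["A"] * 4 + ["B"] * 4

errors = mean_prediction_errors(scores, criterion, groups)
for group, err in sorted(errors.items()):
    print(group, round(err, 3))
```

With these invented data the common line predicts a criterion that is too high for group A and too low for group B. The sketch takes the criterion at face value; as Linn and Werts note, the conclusion holds only if the criterion itself is free of bias and reasonably reliable.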

Determination of test bias is clearly a function of 
test use and interpretation. Scores per se do not determine
test bias (Thorndike, 1971). We may say that test bias is 
situation-specific. A final note is the issue of criterion
relevance. Frequently, a paper and pencil test used as the 
criterion has little, if any, relationship to the tasks required
for job success. In this case it is inappropriate to
label the test unfair. Its use should be termed irrelevant 
(Thorndike, 1971). 